Building Diversified Multiple Trees for Classification in High Dimensional Noise Data

نویسندگان

  • Jiuyong Li
  • Lin Liu
  • Jixue Liu
  • Ryan Green
چکیده

It is common that a trained classification model is applied to the operating data that is deviated from the training data because of noise. This paper demonstrate an ensemble classifier, Diversified Multiple Trees (DMT) is more robust to classify noised data than other widely used ensemble methods. DMT is tested on three real world biological data sets from different laboratories in comparison with four benchmark ensemble classifiers. Experimental results show that DMT is significantly more accurate than other benchmark ensemble classifiers on noised test data. We also discussed a limitation of DMT and its possible variations.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Knowledge Discovery through SysFor - a Systematically Developed Forest of Multiple Decision Trees

Decision tree based classification algorithms like C4.5 and Explore build a single tree from a data set. The two main purposes of building a decision tree are to extract various patterns/logic-rules existing in a data set, and to predict the class attribute value of an unlabeled record. Sometimes a set of decision trees, rather than just a single tree, is also generated from a data set. A set o...

متن کامل

A Robust Ensemble Classification Method for Microarray Data Analysis

Apart from the dimensionality problem, the uncertainty of Microarray data quality is another major challenge of Microarray classification. Microarray data contains various levels of noise and quite often are high levels of noise, and these data lead to unreliable and low accuracy analysis as well as the high dimensionality problem. In this paper, we propose a new Microarray data classification ...

متن کامل

A Maximally Diversified Multiple Decision Tree Algorithm for Microarray Data Classification

We investigate the idea of using diversified multiple trees for Microarray data classification. We propose an algorithm of Maximally Diversified Multiple Trees (MDMT), which makes use of a set of unique trees in the decision committee. We compare MDMT with some well-known ensemble methods, namely AdaBoost, Bagging, and Random Forests. We also compare MDMT with a diversified decision tree algori...

متن کامل

Detection of some Tree Species from Terrestrial Laser Scanner Point Cloud Data Using Support-vector Machine and Nearest Neighborhood Algorithms

acquisition field reference data using conventional methods due to limited and time-consuming data from a single tree in recent years, to generate reference data for forest studies using terrestrial laser scanner data, aerial laser scanner data, radar and Optics has become commonplace, and complete, accurate 3D data from a single tree or reference trees can be recorded. The detection and identi...

متن کامل

Simulation of Smoke Emission from Fires in High-Rise Buildings Using the 3D Model Generated from 2-Dimensional Cadastral Data

Having a 3-Dimensional model of high-rise buildings can be used in disaster management such as fire cases to reduce casualties. The fundamental dilemma in 3D building modeling is the unavailability of suitable data sources. However, available cadastral 2D maps could be used as low-cost and attainable resources for 3D building modeling. Smoke will be a great threat to people's health during a f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1612.05888  شماره 

صفحات  -

تاریخ انتشار 2016